Intro

A few suggestions

  • Guidelines help 99% of us 99% of the time – diverge when you want to, but think hard about why.

  • Good figures need time and effort. Think, experiment, refine, proofread.

  • Make figures that your skills and time allow. Creating data visualizations is a career!

  • Show as much information as possible, within reason.

  • Adapt the guidelines to the presentation medium (journal paper, presentation, magazine article, blog).

  • Labelling and captions often don’t get enough attention.

Bad graphs

Good graphs

Amazing graphs (that I could never make)

Tufte

  • Represent data faithfully – no “selling”

  • Maximize data-to-ink ratio, within reason.
    • do: show lots of information
    • don’t: show a lot of extraneous detail (chartjunk)


  • Clear, detailed, and thorough labelling

Plot elements

  • Data
  • Axes
  • Aspect ratio
  • Background
  • Legend
  • Labels
  • Caption

Data

Clarity and minimalism

The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities measured

The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data

(Tufte)

Pie charts: not a good idea

Neither are stacked and grouped bar charts

Worse still…

Use points, not lines, if the order of the elements is not meaningful.
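As a rough illustration, assuming ggplot2 is installed and using the built-in ChickWeight data (the object names final, means, p_points and p_lines are made up for this sketch): diets are unordered categories, so their means are drawn as points. Joining them with a line would suggest a trend between diets that the data do not contain.

```r
library(ggplot2)

# Mean weight per diet on the last measurement day (day 21)
final <- subset(as.data.frame(ChickWeight), Time == 21)
means <- aggregate(weight ~ Diet, data = final, FUN = mean)

# Diet is categorical with no inherent order: use points
p_points <- ggplot(means, aes(x = Diet, y = weight)) +
  geom_point(size = 3)

# A line needs an explicit group here and implies an ordering
# between diets that the data do not have
p_lines <- ggplot(means, aes(x = Diet, y = weight, group = 1)) +
  geom_line()
```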

What’s wrong here?

Show data variation, not design variation

Graphics should not quote data out of context

Convey groups clearly – colour, fill, facet

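library(dplyr)  # provides %>% and mutate()

data("ChickWeight")
# Approximate week of measurement, rounded to the nearest week (1 to 4)
ChickWeight <- ChickWeight %>% mutate(Week = factor(1 + round(Time/7)))
head(ChickWeight)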
##   weight Time Chick Diet Week
## 1     42    0     1    1    1
## 2     51    2     1    1    1
## 3     59    4     1    1    2
## 4     64    6     1    1    2
## 5     76    8     1    1    2
## 6     93   10     1    1    2

Using fill to show weight distributions for each diet

Using fill + colour to show weight distributions for each diet

Weight distributions for each week

Weight distributions for combinations of diet (fill) and week (colour)

Weight distributions for combinations of diet and week (with interaction)
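The figures for these slides are not reproduced here, so the following is only a sketch of how fill, colour and interaction() can be combined. It assumes ggplot2 is installed; the boxplot geometry and the names cw, p_diet and p_both are illustrative choices, not necessarily what the original plots used.

```r
library(ggplot2)

# Rebuild the Week variable with base R so the sketch is self-contained
cw <- as.data.frame(ChickWeight)
cw$Week <- factor(1 + round(cw$Time / 7))

# Fill only: one box per diet
p_diet <- ggplot(cw, aes(x = Diet, y = weight, fill = Diet)) +
  geom_boxplot()

# Fill for diet, outline colour for week; interaction() makes the
# grouping explicit, giving one box per diet-by-week combination
p_both <- ggplot(cw, aes(x = Diet, y = weight, fill = Diet, colour = Week,
                         group = interaction(Diet, Week))) +
  geom_boxplot()
```

Without the explicit group, ggplot2 infers the grouping from the discrete aesthetics; spelling it out documents the intent.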

Use fill for one grouping variable and faceting for the other

Try both ways: swapping the fill and facet variables often gives a different, interesting perspective
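A minimal sketch of the two arrangements, assuming ggplot2 is installed (cw, p_by_week and p_by_diet are names invented for this example). The first compares diets within each week; the second compares weeks within each diet.

```r
library(ggplot2)

cw <- as.data.frame(ChickWeight)
cw$Week <- factor(1 + round(cw$Time / 7))

# Diet on the fill, one panel per week
p_by_week <- ggplot(cw, aes(x = Diet, y = weight, fill = Diet)) +
  geom_boxplot() +
  facet_wrap(~ Week)

# Week on the fill, one panel per diet
p_by_diet <- ggplot(cw, aes(x = Week, y = weight, fill = Week)) +
  geom_boxplot() +
  facet_wrap(~ Diet)
```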

Avoid cross-hatching or other patterns that distract the mind from the information being presented

Axes

Axes should include or nearly include the range of data, with data filling up the plot
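One practical point in ggplot2, sketched here with the built-in mtcars data (an assumption for illustration; the original figure's data are not shown): limits set on a scale discard points outside them, whereas coord_cartesian() only zooms the view.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Limits on the scale drop points outside the range;
# ggplot2 warns about the removed rows when the plot is drawn
p_clipped <- p + scale_x_continuous(limits = c(2, 4))

# coord_cartesian() zooms in without discarding any data
p_zoomed <- p + coord_cartesian(xlim = c(2, 4))

# Number of cars whose weight falls outside the chosen limits
n_outside <- sum(mtcars$wt < 2 | mtcars$wt > 4)
```

By default the axes already span the data with a small margin, so explicit limits are only needed when that default is unsuitable.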

Don’t insist that zero always be included

Consider a log scale when the data span several orders of magnitude, or when percentage change matters more than absolute change
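A small sketch with made-up data (the growth data frame is hypothetical, constructed so that the value doubles every year): on a linear axis the early years are squashed against zero, while on a log axis constant percentage growth appears as a straight line.

```r
library(ggplot2)

# Hypothetical series that doubles each year
growth <- data.frame(year = 2000:2009, value = 10 * 2^(0:9))

p_linear <- ggplot(growth, aes(x = year, y = value)) +
  geom_point()

# Same data on a log10 y-axis
p_log <- p_linear + scale_y_log10()

# Equal steps in log10(value) correspond to equal percentage changes
log_steps <- diff(log10(growth$value))
```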

Don’t forget to specify units and label axes. Tick intervals should ideally be at nice round numbers.
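A sketch of axis labels with units and round tick intervals, assuming ggplot2 is installed. The units come from the ChickWeight documentation (weight in grams, Time in days since birth); the particular break values are a choice made for this example.

```r
library(ggplot2)

p_labelled <- ggplot(as.data.frame(ChickWeight), aes(x = Time, y = weight)) +
  geom_point(alpha = 0.4) +
  # Axis titles state the quantity and its unit
  labs(x = "Age (days)", y = "Body weight (g)") +
  # Ticks at weekly intervals and at round weights
  scale_x_continuous(breaks = seq(0, 21, by = 7)) +
  scale_y_continuous(breaks = seq(0, 400, by = 100))
```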

Use same scales when graphs are compared
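In ggplot2, facets share their axes by default, which handles the common case. For plots built separately, one option is to compute the limits once and apply them to each plot. A sketch, with cw, y_lim, p_diet1 and p_diet3 as illustrative names:

```r
library(ggplot2)

cw <- as.data.frame(ChickWeight)

# Facets use scales = "fixed" by default, so all panels are comparable
p_shared <- ggplot(cw, aes(x = Time, y = weight)) +
  geom_point() +
  facet_wrap(~ Diet)

# For separately built plots, compute common limits once
y_lim <- range(cw$weight)

p_diet1 <- ggplot(subset(cw, Diet == "1"), aes(x = Time, y = weight)) +
  geom_point() +
  coord_cartesian(ylim = y_lim)

p_diet3 <- ggplot(subset(cw, Diet == "3"), aes(x = Time, y = weight)) +
  geom_point() +
  coord_cartesian(ylim = y_lim)
```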

Think about whether to compare vertically or horizontally

Side-by-side panels are easy with ggplot2: facet_grid(. ~ suburb)

Stacked (vertical) panels: facet_grid(suburb ~ .)
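The suburb data behind these slides are not available here, so this sketch substitutes the Diet variable from ChickWeight to show the same two layouts. Side-by-side panels share a y-axis, which makes heights easy to compare; stacked panels share an x-axis, which makes timing easy to compare.

```r
library(ggplot2)

base_plot <- ggplot(as.data.frame(ChickWeight), aes(x = Time, y = weight)) +
  geom_point(alpha = 0.4)

# Panels in one row: compare weights across diets at a glance
p_horizontal <- base_plot + facet_grid(. ~ Diet)

# Panels in one column: compare when changes happen across diets
p_vertical <- base_plot + facet_grid(Diet ~ .)
```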

Aspect ratios

The aspect ratio can be suggested by the data (e.g. spatial data); otherwise aim for roughly 3:2

Prepare graphics in the final aspect ratio to be used. Never “copy-and-stretch”!
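One way to do this with ggplot2 is to fix the width and height when saving, rather than resizing afterwards. A sketch, assuming a 6 x 4 inch (3:2) figure and a temporary output path chosen only for illustration:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Write the figure at its final size: 6 x 4 inches is a 3:2 ratio
out_file <- file.path(tempdir(), "figure-3x2.pdf")
ggsave(out_file, plot = p, width = 6, height = 4, units = "in")
```

Text and point sizes are then judged at the size the reader will actually see.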

Background

Avoid dark shaded backgrounds

Avoid dark, dominating grid lines

Check that any very thin lines don’t disappear on resizing/printing

theme_bw() is a good default option
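A short sketch of applying theme_bw(), with an optional tweak that lightens the grid further (the grey90 colour and the removal of minor grid lines are suggestions for this example, not part of the original slides):

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# White background with a thin frame and light grid
p_bw <- p + theme_bw()

# Optional: drop minor grid lines and soften the major ones
p_light <- p +
  theme_bw() +
  theme(panel.grid.minor = element_blank(),
        panel.grid.major = element_line(colour = "grey90"))
```

Calling theme_set(theme_bw()) once at the top of a script makes it the default for every later plot in the session.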

Legends

Avoid cluttered legends

Where possible, add labels directly to the elements of the plot rather than use a legend at all.

Use the ggrepel package to avoid overlap between labels
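A minimal sketch of direct labelling with ggrepel, assuming both ggplot2 and ggrepel are installed. The mtcars data and the cars and model names are used purely for illustration.

```r
library(ggplot2)
library(ggrepel)

# Put the car names in a column so they can be mapped to labels
cars <- mtcars
cars$model <- rownames(mtcars)

# geom_text_repel() nudges labels apart so they do not overlap
# each other or the points
p_labelled <- ggplot(cars, aes(x = wt, y = mpg, label = model)) +
  geom_point() +
  geom_text_repel(size = 3)
```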

If this won’t work, then keep the legend from obscuring the plotted data, and make it small and neat

Should the legend go inside or outside the plot area? Data trumps legend: if there is blank space near a corner, put the legend inside; if not, or if it would obscure data, put it outside.
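A sketch of moving and shrinking a legend through theme(), assuming ggplot2 is installed; the sizes are arbitrary example values.

```r
library(ggplot2)

p <- ggplot(as.data.frame(ChickWeight), aes(x = Time, y = weight, colour = Diet)) +
  geom_point(alpha = 0.5)

# Legend below the panel, out of the way of the data
p_bottom <- p + theme(legend.position = "bottom")

# Keep the legend on the right but make it small and neat
p_small <- p +
  theme(legend.position = "right",
        legend.key.size = unit(0.4, "cm"),
        legend.title = element_text(size = 9),
        legend.text = element_text(size = 8))
```

Placing the legend inside the panel is also controlled through the legend.position theme element, but the exact form differs between ggplot2 versions, so check the documentation for the installed version.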

Labels

Write out explanations of the data on the graphic itself. Label important events in the data.
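A sketch of labelling an event directly on the plot with annotate(). The sales data frame and the "New product launched" event are entirely hypothetical, invented to show the mechanics.

```r
library(ggplot2)

# Hypothetical monthly sales with a jump after month 5
sales <- data.frame(
  month = 1:12,
  units = c(20, 22, 21, 25, 24, 40, 42, 41, 45, 44, 47, 50)
)

p_annotated <- ggplot(sales, aes(x = month, y = units)) +
  geom_line() +
  geom_point() +
  # Mark when the event happened and say what it was
  geom_vline(xintercept = 5.5, linetype = "dashed") +
  annotate("text", x = 5.7, y = 30, label = "New product launched", hjust = 0) +
  labs(x = "Month", y = "Units sold")
```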

Avoid overlap as much as possible

Captions

Plots should be self-explanatory, so captions should be detailed.

Proofread carefully so that no text (including the caption) contradicts what is in the figure. Integrated reporting approaches like R Markdown can help with this.
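One way to keep caption and figure consistent is to compute the numbers in the caption from the same data that is plotted. A sketch using ChickWeight (the caption wording and the variable names are illustrative):

```r
library(ggplot2)

cw <- as.data.frame(ChickWeight)

# Derive the caption's numbers from the data instead of typing them
n_chicks <- length(unique(cw$Chick))
n_diets <- nlevels(cw$Diet)
last_day <- max(cw$Time)

caption_text <- sprintf(
  "Body weight of %d chicks on %d diets, measured up to day %d.",
  n_chicks, n_diets, last_day
)

p_captioned <- ggplot(cw, aes(x = Time, y = weight, colour = Diet)) +
  geom_point(alpha = 0.5) +
  labs(x = "Age (days)", y = "Body weight (g)", caption = caption_text)
```

In an R Markdown document the same string could be passed to the fig.cap chunk option, so the caption updates whenever the data change.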

Other stuff

Specific table stuff

  • Think about number of digits to display
  • Don’t drop ending zeros
  • Avoid huge tables
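A small sketch of fixing the number of digits in a table column with base R. Converting rounded numbers straight to text drops trailing zeros; formatC() with format = "f" keeps them. The values are made up for illustration.

```r
x <- c(1.5, 2.31, 10)

# Trailing zero is lost: 10 instead of 10.0
dropped <- as.character(round(x, 1))

# Fixed number of decimals, trailing zeros kept
kept <- formatC(x, format = "f", digits = 1)

print(dropped)
print(kept)
```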

From plot to paper

  • Make sure everything is readable after the figure is scaled

  • Consider vector graphics such as EPS or PDF. These scale properly and do not look fuzzy when enlarged.

  • PNG or JPG is better when there are many data points, because a vector file stores every point and becomes large and slow to render.
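A sketch of saving the same plot in both kinds of format with ggsave(), which picks the device from the file extension. The file names, size and dpi are example values, and a working PNG device is assumed.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  theme_bw()

out_dir <- tempdir()
pdf_file <- file.path(out_dir, "scatter.pdf")
png_file <- file.path(out_dir, "scatter.png")

# Vector output: stays sharp at any magnification
ggsave(pdf_file, plot = p, width = 6, height = 4)

# Raster output: fixed pixel grid, so choose the resolution up front
ggsave(png_file, plot = p, width = 6, height = 4, dpi = 300)
```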

Presentations v Papers

  • Will you explain everything on the plot? How much time do you have?

  • Is the unexplained stuff necessary? Is it obvious or confusing without explanation? Can labels help?

  • It is usually better to go with simpler figures in presentations (especially tables)

Summing up

  • Spend time making figures that look good – massive help to getting your point across

  • Two main tasks: deciding what to show (data), and then lots of finicky but important style details (axis labels, colour schemes, captions, etc.). You need to get both right.

  • Lots of guidelines and good examples – use these and develop your own sense of what looks good (within reason)